Before starting, you will need the following R packages. They are used throughout this book and must be loaded before running any analysis. A complete list of all packages used in the book can be found in Chapter 5.
```r
# General Packages:
library(tidyverse)
library(tools)
library(parallel)
library(boot)
library(table1)

# Packages for Plotting:
library(ggplot2)
library(cowplot)
library(ComplexHeatmap)
library(ggh4x)

# Packages for High Dimensional Mediation:
library(HIMA)
library(xtune)
library(RMediation)
library(glmnet)

# Packages for Mediation with Latent Factors:
library(r.jive)

# Packages for Quasi-mediation:
library(LUCIDus)
library(mclust)
library(networkD3)
library(plotly)
library(htmlwidgets)
library(glasso)
library(nnet)
library(progress)
library(jsonlite)
```
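Because `library()` stops at the first package that is not installed, it can be convenient to check for missing packages up front. The following is a minimal sketch using only base R; the helper name `find_missing_packages` is hypothetical and not part of the book's code, and the package vector simply mirrors the `library()` calls above.

```r
# Hypothetical helper (not from the book): return the entries of `pkgs`
# that cannot be found in the current R library paths.
find_missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Package names copied from the library() calls in this section
required_pkgs <- c(
  "tidyverse", "tools", "parallel", "boot", "table1",
  "ggplot2", "cowplot", "ComplexHeatmap", "ggh4x",
  "HIMA", "xtune", "RMediation", "glmnet",
  "r.jive",
  "LUCIDus", "mclust", "networkD3", "plotly", "htmlwidgets",
  "glasso", "nnet", "progress", "jsonlite"
)

missing_pkgs <- find_missing_packages(required_pkgs)
if (length(missing_pkgs) > 0) {
  message("Packages not yet installed: ", paste(missing_pkgs, collapse = ", "))
}
```

How each missing package is installed depends on where it is distributed; for example, ComplexHeatmap is distributed through Bioconductor rather than CRAN.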
In order to replicate the style of the figures in this book, you will also have to set the ggplot theme:
```r
ggplot2::theme_set(cowplot::theme_cowplot())
```
1.2 Custom Functions
The analyses in this book rely on several custom functions. The code for these functions is provided in Chapter 6.
1.3 The Data
The data used in this project are simulated, based on data from the Human Early Life Exposome (HELIX) cohort (Vrijheid et al. 2014). The data were simulated for one exposure, five omics layers, and one continuous outcome (after publication, these data will be available on GitHub). The data are stored as a named list with 6 elements: a separate numeric matrix for each of the 5 omics layers, plus the exposure and phenotype data. In every dataset in the list, rows represent individuals; in the omics matrices, columns represent omics features. In this analysis, the exposure is maternal mercury (hs_hg_m_scaled) and the outcome is CK-18 levels (ck18_scaled).
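To make the data format concrete, here is a small toy list with the same general shape, built with base R. This is only a sketch under stated assumptions: the layer names, dimensions, and values are placeholders rather than the book's data, and it assumes the sixth element is a phenotype table that holds both the exposure and the outcome.

```r
set.seed(1)
n <- 10                                   # number of individuals (toy value)
ids <- paste0("id_", seq_len(n))

# Hypothetical helper: numeric matrix with individuals in rows and
# features in columns.
make_layer <- function(prefix, p) {
  matrix(rnorm(n * p), nrow = n,
         dimnames = list(ids, paste0(prefix, "_", seq_len(p))))
}

# Layer names are placeholders, not the names used in the simulated data.
toy_data <- list(
  layer_1 = make_layer("l1", 4),
  layer_2 = make_layer("l2", 3),
  layer_3 = make_layer("l3", 5),
  layer_4 = make_layer("l4", 2),
  layer_5 = make_layer("l5", 3),
  phenotype = data.frame(hs_hg_m_scaled = rnorm(n),
                         ck18_scaled = rnorm(n),
                         row.names = ids)
)

# Every element should describe the same individuals in the same order.
same_ids <- vapply(toy_data, function(x) identical(rownames(x), ids), logical(1))
stopifnot(all(same_ids))

str(toy_data, max.level = 1)
```

Checking that row names agree across elements is a cheap safeguard before joining layers, since the later analyses combine features from different layers by individual.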
In the simulated data, each 1 standard deviation increase in maternal mercury was associated with a 0.11 standard deviation increase in CK18 enzymes (Figure 1.1; p=0.02), after adjusting for child age and child sex.
```r
ggplot(data = simulated_data[["phenotype"]],
       aes(x = hs_hg_m_scaled, y = ck18_scaled)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, geom = "smooth") +
  xlab("Maternal Mercury Exposure (Scaled)") +
  ylab("CK-18 Levels (Scaled)")
```
Figure 1.1: Association between maternal mercury and CK18 in the Simulated Data
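The adjusted association reported above is the kind of estimate a linear model with covariates produces. The following is a minimal sketch on toy data, not the book's analysis: the covariate names `child_age` and `child_sex` and the effect size used to generate the outcome are assumptions made for illustration only.

```r
set.seed(42)
n <- 200

# Toy data; only hs_hg_m_scaled and ck18_scaled are names taken from the text.
toy <- data.frame(
  hs_hg_m_scaled = rnorm(n),
  child_age = rnorm(n, mean = 8, sd = 1.5),   # hypothetical covariate name
  child_sex = factor(sample(c("female", "male"), n, replace = TRUE))
)
# Arbitrary toy effect of the exposure on the outcome
toy$ck18_scaled <- 0.1 * toy$hs_hg_m_scaled + rnorm(n)

# Exposure-outcome association, adjusted for child age and child sex
fit <- lm(ck18_scaled ~ hs_hg_m_scaled + child_age + child_sex, data = toy)

# Estimate and p-value for the exposure term
coef(summary(fit))["hs_hg_m_scaled", c("Estimate", "Pr(>|t|)")]
```

When both exposure and outcome are standardized, the exposure coefficient reads as the change in outcome standard deviations per one standard deviation increase in exposure.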
1.3.3 Correlation of omics features
Figure 1.2 shows the correlation within and between the omics layers in the simulated data.
```r
# omics_lst is assumed to be the named list of omics matrices described above
# (rows = individuals, columns = features), defined before this chunk.

# Change omics list elements to dataframes
omics_df <- purrr::map(omics_lst, ~ as_tibble(.x, rownames = "name")) %>%
  purrr::reduce(left_join, by = "name") %>%
  column_to_rownames("name")

# Map each feature name to the omics layer it comes from
meta_df <- imap_dfr(
  purrr::map(omics_lst, ~ as_tibble(.x)),
  ~ tibble(omic_layer = .y, ftr_name = names(.x))
)

# Correlation Matrix
cormat <- cor(omics_df, method = "pearson")

# Annotations
annotation <- data.frame(ftr_name = colnames(cormat),
                         index = 1:ncol(cormat)) %>%
  left_join(meta_df, by = "ftr_name") %>%
  mutate(omic_layer = toTitleCase(omic_layer))

# Make Plot
Heatmap(cormat,
        row_split = annotation$omic_layer,
        column_split = annotation$omic_layer,
        show_row_names = FALSE,
        show_column_names = FALSE,
        column_title_gp = gpar(fontsize = 12),
        row_title_gp = gpar(fontsize = 12),
        heatmap_legend_param = list(title = "Correlation"))
```
Figure 1.2: Heatmap illustrating the correlation of molecular features within and between different omics layers.
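The within-layer and between-layer blocks that the heatmap displays can also be summarized numerically. The sketch below uses two small toy layers and base R only; the layer names, sizes, and the choice of mean absolute correlation as a summary are illustrative assumptions, not part of the book's analysis.

```r
set.seed(7)
n <- 50

# Two toy layers with individuals in rows and features in columns
layer_a <- matrix(rnorm(n * 3), nrow = n, dimnames = list(NULL, paste0("a_", 1:3)))
layer_b <- matrix(rnorm(n * 2), nrow = n, dimnames = list(NULL, paste0("b_", 1:2)))

# Pearson correlation across all features from both layers
toy_cormat <- cor(cbind(layer_a, layer_b), method = "pearson")

# Layer membership of each column of the correlation matrix
layer_of <- rep(c("a", "b"), times = c(3, 2))

within_a   <- toy_cormat[layer_of == "a", layer_of == "a"]
between_ab <- toy_cormat[layer_of == "a", layer_of == "b"]

# Mean absolute correlation; the within-layer block skips the diagonal
mean(abs(within_a[upper.tri(within_a)]))
mean(abs(between_ab))
```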
Vrijheid, Martine, Rémy Slama, Oliver Robinson, Leda Chatzi, Muireann Coen, Peter van den Hazel, Cathrine Thomsen, et al. 2014. “The Human Early-Life Exposome (HELIX): Project Rationale and Design.” Environmental Health Perspectives 122 (6): 535–44. https://doi.org/10.1289/ehp.1307204.